Automatic recognition of historical handwritten manuscripts is a daunting task due to paper degradation over time. Recognition-free retrieval or word spotting is popularly used for information retrieval and digitization of the historical handwritten documents. However, the performance of word spotting algorithms depends heavily on feature detection and representation methods. Although there exist popular feature descriptors such as Scale Invariant Feature Transform (SIFT) and Speeded Up Robust Features (SURF), the invariant properties of these descriptors amplify the noise in the degraded document images, rendering them more sensitive to noise and complex characteristics of historical manuscripts. Therefore, an efficient and relaxed feature descriptor is required as the handwritten words across different documents are indeed similar, but not identical. This paper introduces a Radial Line Fourier (RLF) descriptor for handwritten word representation, with a short feature vector of 32 dimensions. A segmentation-free and training-free handwritten word spotting method is studied herein that relies on the proposed Radial Line Fourier (RLF) descriptor, taking into account different keypoints representations and using a simple preconditioner-based feature matching algorithm. The effectiveness of the proposed RLF descriptor for segmentation-free handwritten word spotting is empirically evaluated on well-known historical handwritten datasets using standard evaluation measures.